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ABSTRACT 

We present wide-field near-infrared J and K s images of the Andromeda Galaxy taken with WIRCam on 
the Canada-France-Hawaii Telescope (CFHT) as part of the Andromeda Optical and Infrared Disk Survey 
(ANDROIDS). This data set allows simultaneous observations of resolved stars and NIR surface brightness 
across M31's entire bulge and disk (within R = 22 kpc), permitting a direct test of the stellar composition of 
near-infrared light in a nearby galaxy. Our survey complements the similar Panchromatic Hubble Andromeda 
Treasury survey by covering M3 1 's entire disk, rather than a single quadrant, at similar wavelengths, albeit with 
lower spatial resolution. The primary concern of this work is the development of NIR observation and reduction 
methods to recover a uniform surface brightness map across the 3° x 1° disk of M31. This necessitates sky- 
target nodding across 27 WIRCam fields. Two sky-target nodding strategies were tested, and we find that 
strictly minimizing sky sampling latency does not maximize sky subtraction accuracy, which is at best 2% of 
the sky level. The mean surface brightness difference between blocks in our mosaic can be reduced from 1% 
to 0.1% of the sky brightness by introducing scalar sky offsets to each image. We test the popular Montage 
package, and also develop an independent method of estimating sky offsets using simplex optimization; we 
show these two optimization schemes to differ by up to 0.5 mag arcsec" 2 in the outer disk. We find that 
planar sky offsets are not acceptable for subtracting residual backgrounds across WIRCam fields that are much 
smaller than the mosaic area. The true surface brightness of M31 can be known to within a statistical zeropoint 
of 0. 15% of the sky level (0.2 mag arcsec" 2 uncertainty at R = 15 kpc). We also find that the surface brightness 
across individual WIRCam frames is limited by both WIRCam flat field evolution and residual sky background 
shapes. To overcome flat field variability of order 1% over 30 minutes, we find that WIRCam data should 
be calibrated with real-time sky flats. Due either to atmospheric or instrumental variations, the individual 
WIRCam frames have typical residual shapes with amplitudes of 0.2% of the sky after real-time flat fielding 
and median sky subtraction. We present our WIRCam reduction pipeline and performance analysis here as a 
template for future near-infrared observers needing wide-area surface brightness maps with sky-target nodding, 
and give specific recommendations for improving photometry of all CFHT/WIRCam programs. 
Subject headings: methods: observational, techniques: photometric 



1. INTRODUCTION 

Thanks largely to its proximity, kinship with the Milky 
Way, and being an ideal foil for modern galaxy formation 
models, the Andromeda galaxy has been the focus of numer- 
ous investigations of galaxy structure (Ibata et al. 2005; Irwin 
et al. 2005; McConnachie et al. 2009; Courteau et al. 2011) 
and stellar populations (Williams 2002; Worthey et al. 2005; 
Saglia et al. 2010). The shapes, ages, kinematics and relative 
fraction of galaxy components (bulge, disk, halo) in large spi- 
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ral galaxies like our own reveal precious information about 
their formation, accretion, and merging histories (see the re- 
view of Kormendy & Kennicutt 2004). 

Unfortunately, the fundamental task of disentangling 
galaxy components — which typically involves light profile 
decompositions, colour gradients and mass modelling — 
remains non-trivial. Stellar population studies of spiral galax- 
ies are thwarted by a three-fold degeneracy between stellar 
age (A), metallicity (Z) and ISM dust that can only be lifted by 
combining, at the very least, optical and infrared images with 
realistic dust models (de Jong 1996; MacArthur et al. 2004; 
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Pforr et al. 2012). Likewise, mass models of spiral galaxies 
suffer a degeneracy between the stellar M/L and dark halo 
parameters that requires, in addition to an extended rotation 
curve, deep and accurate multi-band imaging to uniquely con- 
strain the stellar M/L ratio (Dutton et al. 2005). 

Compounding these challenges are fundamental uncertain- 
ties in modern stellar population synthesis, particularly uncer- 
tainties in the interpretation of near-infrared (NIR) light. In 
the Galaxy and Mass Assembly (GAMA) survey, Taylor et al. 
(2011) combined SDSS and UKIRT ugrizYJHK photometry 
of galaxies to show that optical-NIR SEDs yield unreliable 
population synthesis fits compared to optical-only SED fits. 
This failure is largely attributable to inadequate stellar popu- 
lation synthesis recipes for NIR bands and naive parameteri- 
zation of star formation histories. 

First, spectral energy distribution (SED) fitting often relies 
on simplistic star formation history (SFH) parameterizations. 
Because NIR colours lift age-metallicity-dust degeneracies, 
modelling of NIR bands may require additional sophistica- 
tion, namely composite star formation and metal enrichment 
histories. The appropriate form of SFH models cannot be 
constrained from the integrated light of galaxies alone (as 
is typically attempted); resolved colour-magnitude diagrams 
(CMDs) are both more effective and, in fact, essential at de- 
riving non-parametric stellar population histories. 

Second, the NIR light is dominated by thermally-pulsating 
AGB (TP-AGB) stars from intermediate-aged stellar popula- 
tions. Modelling TP-AGB stars is most challenging due to 
their complex dredge-up cycles that change surface chemistry 
and temperature (the M- to C-type transition), and circum- 
stellar winds that further perturb an AGB star's location in the 
CMD. A proper calibration of NIR stellar population synthe- 
sis models {e.g., Maraston 2005, Chariot & Bruzual in prep.) 
will yield a 30%-50% improvement in the estimation of stel- 
lar masses and ages of high redshift systems {e.g., Maras- 
ton et al. 2006; Bruzual 2007; Conroy & Gunn 2010; Conroy 
2013). 

Our remedy for both understanding the structure of M3 1 , 
and more fundamentally to understand NIR stellar popula- 
tions, is to survey the entire bulge and disk {R < 22 kpc) of 
M3 1 in both resolved and integrated stellar light at J and K s 
wavelengths. In doing so, we can directly relate a NIR stel- 
lar population's decomposition in the colour-magnitude plane 
to the panchromatic SED of M3 1 . Though such a calibration 
could be made with other galaxies, M31 is unique in its prox- 
imity so that even ground-based instrumentation can resolve 
its bright stellar population For reference, 1" = 3.7 pc across 
the disk of M31 (we adopt D M 3i = 785 kpc, McConnachie 
et al. 2005). 

A wealth of photometric data exist for M31, however, none 
provide simultaneous observations of M31's resolved and in- 
tegrated NIR light. The best resource for resolved stellar pop- 
ulations across the entire disk of M31, to date, is the Local 
Group Galaxy survey (LGGS, Williams 2003; Massey et al. 
2006). This survey, although covering the UBVRI wave- 
lengths, does not extend to the JHK NIR wavelengths most 
contentious for stellar population models. Advancing our 
view of M31 to NIR wavelengths, Beaton et al. (2007) assem- 
bled a 2.8° JHK S mosaic of M31 with the 2MASS 6X pro- 
gram. Those observations, the first nearly dust-free of M31's 
stellar content, were used by Athanassoula & Beaton (2006) 
as evidence for a bar embedded in a classical bulge. Beyond 
the bulge, the 2MASS 6X images have limited utility; sky un- 
certainties restrict their use to infer structural and photometric 



properties of the disk. Further, the pixel scale of 1" and inte- 
gration depth of 46.8 seconds prevent point source measure- 
ment of M31 stars in the 2MASS 6X images. As a result, the 
state-of-the-art NIR view of M31 is the slightly longer 3.6 /im 
Spitzer/IRAC map of Barmby et al. (2006). Spitzer reduces 
the uncertainties of sky estimation, though the pixel scale of 
0"86 also prevents point source measurements of individual 
stars in the M3 1 disk. 

Decompositions of M31's stellar populations have been 
made with high-resolution Hubble Space Telescope (HST) 
observations (Brown et al. 2003, 2006, 2008). These authors 
detected an intermediate age population in the inner halo of 
M31, and even disentangled debris associated with the Gi- 
ant Stream around M31 in six fields sampling the outer disk, 
halo, and Giant Stream around M31 (Brown et al. 2009). Sim- 
ilarly, though in the near-infrared, Olsen et al. (2006) derived 
star formation histories within 22."5 x 22"5 fields in M31's 
inner disk and bulge with ground-based adaptive optics ob- 
servations with the Gemini ALTAIR/NIRI instrument. Yet 
these pencil beam surveys cannot be construed as represen- 
tative of the entire Andromeda Galaxy. Thus the boldest step 
forward in understanding M31's stellar populations is com- 
ing from the Panchromatic Hubble Andromeda Treasury Sur- 
vey (PHAT, Dalcanton et al. 2012). PHAT provides wide- 
field coverage from M31's centre to the 10 kpc star forming 
ring, a panchromatic view of stellar populations from 3000A- 
17000A, and resolution of stars in very crowded environments 
such as the bulge. Despite its many enviable traits, PHAT only 
covers a single quadrant of M31, making any global conclu- 
sions about M3 1 's stellar populations incomplete. Further, the 
wavelength coverage of HST/WF3 falls short of the 2.2 fim K- 
band, making empirical calibration of the common NIR bands 
used by wide-field NIR surveys {JHK) impossible. Thus there 
is good cause, even in the era of PHAT, to revisit M3 1 with a 
ground-based survey that covers the entire M3 1 disk at NIR 
wavelengths, while using the best natural seeing in the north- 
ern hemisphere (on Mauna Kea) to resolve stars even more 
effectively than the previous ground-based survey of M31's 
disk (LGGS). 

We present such a survey, the first instalment of the An- 
dromeda Optical and Infrared Disk Survey (ANDROIDS), in 
this paper. ANDROIDS uses the WIRCam instrument (Puget 
et al. 2004) on the Canada-France-Hawaii Telescope (CFHT), 
which is among the first generation of wide-field ground- 
based NIR detector arrays, covering a 21(5 x 21.'5 field of 
view. Indeed, the advent of detectors such as WIRCam makes 
such a wide-field, high-resolution survey of an object as vast 
as M31 possible. The excellent natural seeing on Mauna Kea 
of 0"65 is sufficient for resolving giant-branch stars through- 
out the disk of M31. 

Recovering the true NIR surface brightness map of M31 
is, however, more technically challenging. The NIR sky is 
~ 10 3 x brighter than the NIR surface brightness of M31 at 
R = 20 kpc, demanding exceptionally careful sky background 
characterization. Whereas most NIR galaxy surveys can mea- 
sure the instantaneous background blank sky pixels surround- 
ing the galaxy on a detector array, M3 1 's size requires physi- 
cally nodding of the telescope away from the galaxy by l°-3° 
to sample blank sky (called Sky-Target, or ST, nodding). That 
we can never observe the instantaneous sky emission on the 
disk of M31, but rather sample the sky at both a different lo- 
cation and time, introduces additional complications. Adams 
& Skrutskie (1996) clearly showed, with 9° x 9° movies of 
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the sky, that NIR sky emission has coherent spatial structure 
that moves across the sky, akin to a cirrus cloud system. This 
assures that sky sampled from a sky field will not correspond 
directly to the sky affecting disk observations. 

Another concern is the accuracy of the surface brightness 
shape across individual WIRCam fields of view. Spatial struc- 
tures in the NIR sky can leave residual sky shapes in back- 
ground subtracted disk images that ultimately affect our abil- 
ity to produce a seamless NIR mosaic of M3 1 . Vaduvescu & 
McCall (2004) also found that detector systems themselves, in 
their case the (now decommissioned) CFHT-IR camera, can 
add a time-varying background signal whose strength may be 
comparable to the NIR surface brightness of the outer M31 
disk. 

Because such a large mosaic has never before been assem- 
bled in a ST nodding WIRCam program, we focus this con- 
tribution on engineering the best practices for this type of ob- 
serving. This includes: finding the optimal ST nodding ca- 
dence, defining the appropriate data reduction procedures for 
a WIRCam surface brightness reduction, and finally present- 
ing an analysis of the surface brightness accuracy in wide- 
field WIRCam mosaics. 

Section 2 describes the novel observational strategies used 
to reduce sky subtraction uncertainties. Section 3 presents 
the image reduction pipeline; with night sky flat fielding in 
Section 4, median sky subtraction in Section 5, and zeropoint 
calibration practises in Section 6. In Section 7 we present our 
method for recovering the galaxy surface brightness by min- 
imizing image-to-image differences across the mosaic, while 
in Section 8 we analyze the results of this algorithm. We es- 
timate the systematic uncertainties in our mosaic solution in 
Section 9, where we also compare our technique to the Mon- 
tage package (Berriman et al. 2008) and the Spitzer/IRAC mo- 
saics. In Section 10 we consider the accuracy of the surface 
brightness shapes recovered by our pipeline. Ultimately we 
seek the observation and reduction method that maximizes 
surface brightness accuracy. Finally in Section 1 1 we sum- 
marize the uncertainty of NIR sky subtraction on the scale of 
M3 1 and outline our ideal observation and reduction method. 

2. OBSERVATIONS 

The Andromeda Galaxy (M31) was observed in the NIR 
using the WIRCam instrument, mounted to the 3.6-meter 
Canada-France-Hawaii Telescope (CFHT), at the summit of 
Mauna Kea in Hawaii. Observations were carried out in the 
NIR J (Ao ~ 1 .2/im) and {Xq ~ 2.2/im) bands. 

WIRCam itself is an array of four HgCdTe HAWAII-RG2 
detectors (Puget et al. 2004). Each detector comprises 2048 x 
2048 pixels, with a scale of 0"3 pix" 1 . This pixel scale criti- 
cally samples the typical seeing of 0"65 at CFHT. WIRCam's 
detectors are arranged in a 2 x 2 grid with 45" gaps, so that 
the entire instrument covers 21.5' x 21.5' of sky. It is truly 
the advent of NIR focal plane arrays, like WIRCam, that have 
enabled wide-field studies of M31 in the NIR. 

The ANDROIDS WIRCam survey is designed to simultane- 
ously resolve stars and recover the integrated surface bright- 
ness of the entire M31 disk. As discussed in §1, NIR obser- 
vations require frequent monitoring of the sky background. 
Vaduvescu & McCall (2004) found, for example, that the NIR 
sky intensity can vary by 0.5% per minute; yet the low sur- 
face brightness of M31's NIR disk at R = 20 kpc requires 
us to constrain the sky brightness to approximately 0.01%. 
Since M31, with a 190' x 60' optical disk, is much larger than 
the WIRCam fields of view, monitoring of the sky zeropoint 
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Figure 1. ANDROIDS WIRCam field positions on M31. Central blue fields 
are the 27 disk fields observed in 2007B, surrounded by 4 sky fields (blue 
with solid black outlines). Red fields at center are the 12 disk fields observed 
in 2009B. The red ring of 53 fields is the 2009B sky sampling ring. The 
dashed yellow ellipse marks the M31 disk at R = 20 kpc along the major axis. 
Coordinates are centered on the nucleus of M3 1 with North up, and East is 
left. 

is only possible by periodically pointing the telescope away 
from M31, towards blank sky, through sky-target (ST) nod- 
ding. The fundamental compromise of ST nodding observa- 
tion programs is to balance the cadence of sky sampling with 
the efficiency of observing the target itself. Although studies 
such as Vaduvescu & McCall (2004), and references therein, 
provide good guidelines for NIR sky behaviour, no program 
has attempted to construct a near-IR surface brightness mo- 
saic covering an area as large and faint as M31's disk. 

In this program, we have the opportunity to experiment with 
different ST nodding strategies since observations were taken 
over the 2007B and 2009B semesters. An objective of this 
study is to determine how observational design can improve 
the construction of a wide-field NIR mosaic by comparing the 
performance of these two observing strategies. 

2.1. 2007B Semester 

The initial survey was carried out in the 2007B semester by 
the CFHT Queue Service Observing under photometric con- 
ditions. Here M31 is covered with 27 contiguous WIRCam 
fields out to the optical radius where /j,y = 23 mag arcsec" 2 at 
R = 2Q kpc. The fields are arranged with at least 1' overlap in 
declination, and approximately 5' overlap in right ascension. 
The field configuration is shown in Fig. 1 . 

As shown in Table 1, each field was integrated for 16 x 
47 s = 12.5 minutes in J and 26 x 25 s = 10.8 minutes in K s . 
These integrations are sufficiently deep for resolved stellar 
photometry to reach at least 1 mag below the tip of the red 
giant branch, a crucial requirement for decomposing the con- 
tributions of red giant and asymptotic giant branch stars to the 
NIR light. 

The 2007B ST nodding strategy was motivated by a canon- 
ical understanding of NIR sky behaviour, since ST nodding 
sky subtraction had never been attempted on this scale before. 
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Table 1 

Summary of WIRCam observing programs. N^at i s me number of WIRCam fields covering in the M3 1 disk in each semester (see Fig. 1 ). ST Nods with 
superscripts denote the number of times an observation is repeated at a given field. 7] nt is the total integration time per disk field while T eX p is the integration 
time per WIRCam exposure. Eff. is the observing efficiency, or percentage of time in a program allocated to integrating the disk of M3 1 , compared to nodding, 
read out and sky overheads, /i^y gives the range (min-max) of sky surface brightnesses seen in each band, for each semester. PSF reports the distribution seeing 
as measured from the full width at half maximum (FWHM) of stellar point spread functions in the uncrowded sky images. 
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Figure 2. Time latency between target observations and sky field sampling 
in the 2007B and 2009B WIRCam observing runs. The 2009B program was 
designed to ensure that no disk sample would be removed by more than 1.5 
minutes from a sky sample by using a STTS nodding pattern. 
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Figure 3. Distance between sky and target observations in the 2007B and 
2009B WIRCam observing runs. The larger nodding distance of 2009B is 
a consequence of sky ring sampling. The maximum nodding distance across 
the sky ring was purposely set to ~ 3° to avoid excessive time overheads (see 
Fig. 1). As such, a given disk field only samples roughly half of the full sky 
ring. 

The NIR sky intensity can be expected to change by 5% in 
10 minutes (Adams & Skrutskie 1996; Vaduvescu & McCall 
2004). Since the sky itself is ~ 10 3 x brighter than the outer 
disk of M31 in the NIR, a 5% uncertainty in the background 
would be fatal to our objective of recovering M31's extended 
NIR surface brightness. To constrain sky brightness to within 
1%, we ensured that a sky sample would be no more than 
5 minutes removed from a M31 target image. Given the re- 
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Figure 4. Sky levels observed in the 2007B and 2009B programs, as applied 
to each target field. 

spective exposure times (chosen so as not to saturate with the 
sky brightness), this implied a sky (S)-target (T) observing 
sequence of S 3 T & S 3 in J and S 5 T n S 5 in K s . 1 Four sky fields 
were chosen (Fig. 1), and each disk field was associated with 
a single sky field. 

2.2. 2009B Semester 

Analysis of the 2007B data revealed that the adopted sky- 
target nodding strategy was not sufficient for recovering the 
M31 surface brightnesses due to uncertainties in the sky back- 
ground. Repeatedly sampling one of only four sky fields also 
proved not ideal. This motivated a 2009B observing campaign 
informed by our experiences. 

Rather than replicate the 28-field footprint of the 2007B 
campaign, we observed 12 new fields (see Fig. 1) that overlap 
each other and all of the 2007B footprints, to form a network 
of well sky subtracted fields. Thus the 2009B observations 
augment and calibrate the 2007B NIR mapping. 

To improve sky subtraction fidelity, we recognized chal- 
lenges not fully appreciated in the 2007B survey design. Not 
only does the sky background change rapidly in time, it pos- 
sess a significant spatial structure on the scale of WIRCam 
fields and larger. This has two ramifications: the sky level 
sampled at a sky field will not necessarily reflect the sky back- 
ground present at the disk, and that the sky background in 
each WIRCam frame has a 2D shape, not simply a scalar 
level. 

1 Superscripts here denote the number of times an observation is repeated 
in sequence for a given target disk field. 
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This resulted in three principle changes to observing strat- 
egy. First, we chose to minimize latency between sky and 
target observations with a ST 2 S pattern. That is, each target 
observation was directly paired with a sky observation taken 
within 1.5 minutes (Fig. 2). 

Second, we also increased the number of repetitions on 
each field, so that each field is observed 40 times in each band 
in a [ST 2 ] 20 S pattern. This repetition enables averaging over 
spatial sky background structures on the scale of WIRCam 
fields. 

Finally, we employ a pseudo-randomized sky-targeting 
nodding pattern where no sky field is used repeatedly for a 
disk field. In order to maintain rapid telescope nods, only 
northern sky fields serviced the northern disk, and similarly 
for the southern fields; the maximum offset on the sky was 3° 
(see Fig. 3). This non-repetitive sampling of sky fields yielded 
two possible advantages: 1) when a median sky image is con- 
structed, many sky shapes are combined, possibly yielding an 
intrinsically flatter image of sky (see §5), and 2) if there is a 
coherent structure in the NIR sky, sampling sky fields degrees 
apart in rapid succession should average out these systematic 
biases in estimating the sky level on the disk. 

3. IMAGE PREPARATION 

The crux of this paper is finding how wide-field NIR mo- 
saics can best be made with the WIRCam instrument on 
CFHT, and determining what limits the accuracy of our sur- 
face brightness maps. To begin, we note that WIRCam data 
are offered by CFHT in three progressive stages of prepro- 
cessing by their Tiwi 1.0 pipeline to allow programmes, such 
as this one, to re-implement calibration recipes for potentially 
higher surface brightness accuracy. The data flavours are: a 
raw image that is essentially untouched after leaving the in- 
strument (*o . f its); an image that has been corrected for 
nonlinearity, dark subtracted and flat fielded (*s . f its); and 
an image that has been sky subtracted, in addition to all the 
previous treatments (*p.fits). 

As sky subtraction is the highest source of error in our pro- 
gram, the middle data product, * s . f it s, would appear most 
amenable as a starting point. Nonetheless, two Tiwi 1.0 pro- 
cessing stages included in *s . f its products must be han- 
dled carefully. 

Cross-talk correction — WIRCam integrations prior to March 
2008 (includes the 2007B data set, not the 2009B data) suf- 
fered from electronic cross talk within the detector. This cross 
talk is manifested in repeating rings above and below satu- 
rated stars. By default, the Tiwi 1.0 pipeline removes this 
cross talk by subtracting a median of the 32 amplifier slices. 
Unfortunately, this algorithm fails in cases where the back- 
ground has a surface brightness gradient (such as on the disk 
of M31) and produces an brightness gradient that is stronger 
than the galaxy surface brightness itself. Loic Albert (then at 
CFHT) kindly re-processed our 2007B data set with the cross- 
talk correction turned off. 

Flat fielding — We discovered that the dome flat fielding of- 
fered by Tiwi 1.0 was only accurate to 2% of edge-to-edge 
intensity. The ineffectiveness of WIRCam dome flat fielding 
is evident in *s . f its images that show significant detec- 
tor structure, despite having been flat fielded. An example 
of this structure is related to the different gain structures of 

2 See http://cfht.hawaii.edu/Instruments/Imaging/ 
WIRCam/WIRCamCrosstalks .html. 
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Figure 5. Comparison of a WIRCam frame cutout processed with dome flats 
by the Tiwi 1.0 pipeline (top), and with sky flats (bottom). Both images are 
shown in linear counts with identical level ranges. No median sky subtraction 
has been applied. Dome flats leave WIRCam images with dust artifacts (left) 
and detector surface defects (right). Furthermore, the 64-pixel high horizontal 
amplifier bands are clearly visible. Simply using sky flats eliminates these 
artifacts. 

the 32 amplifiers that service independent horizontal bands 
across each WIRCam detector (see Fig. 5) — a proper flat field 
capture such gain structure for proper photometry. Users of 
fully-processed Tiwi 1.0 *p. fits images do not directly 
notice the unsuitability of dome flats since images of the de- 
tector gain structure embedded in the NIR bright sky back- 
ground are subtracted from the signal as part of the median 
sky subtraction step. Yet since pixel gain changes the signal 
in proportion to incident flux, subtraction is fundamentally 
the wrong operation to use. Our remedy is to adopt night 
sky flats, which use the median night sky as the illumination 
reference rather than a dome lamp. The profound difference 
between dome or night sky flats is shown in Fig. 6. Our con- 
fidence in night sky flats as the correct choice for flat fielding 
WIRCam lies in their ability to remove both large scale illu- 
mination features (see §10) and pixel-to-pixel gain changes 
across WIRCam {e.g., Fig. 5). Production of WIRCam sky 
flats is subtle as we find the fiat field structure to be time de- 
pendent on sub-hour scales, and the NIR night sky itself is 
not fiat either. A comprehensive discussion of WIRCam flat 
fielding is presented in §10. 

Our ANDROIDS pipeline thus begins with *s . fits data 
that have been uncorrected for dome fiat fielding. That is, we 
multiply the *s . fits image with its associated dome fiat. 3 
The result is an image that retains Tiwi 7.0's prescription for 
dark subtraction, bad pixel masking and non-linearity correc- 
tion. The ANDROIDS WIRCam pipeline then follows the steps 
described below. 

3.1. Reduction outline 

Our ANDROIDS/WIRCam reductions begin by unify- 
ing the World Coordinate System across the dataset with 
SCAMP (Bertin 2006). Scamp matches stars in Source Ex- 
tractor (Bertin & Arnouts 1996) catalogs of each WIRCam 
*p . fits frame both internally (to cr m = 0"10), and against 
the 2MASS Point Source Catalog (Skmtskie et al. 2006) to a 
precision of tr re f = 0."15. By processing all 4286 frames in the 
ANDROlDS/WIRCam survey simultaneously, SCAMP allows 
an accurate and internally consistent coordinate frame for our 

3 Dome and twilight flats are made available by CFHT, http : //limu . 
cf ht . hawaii . edu : 80/ de trend/ wircam/. 
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Figure 6. Difference image between the 09BQ01 K s (queue run) sky flat 
and the domef lat_8302B_20090728HST143302_Ks dome flat used 
in the 'I'iwi 1.0 pipeline, in percent. Using the median night sky illumination 
over a WIRCam queue run, rather than a dome lamp, results not only in > 2% 
edge-to-edge difference in the large scale WIRCam illumination function, but 
also in different characterizations of gain structure in the WIRCam detector 
amplifiers (the 32 horizontal bands in each of the four WIRCam detectors). 

mosaic. SCAMP handles this data volume gracefully provided 
we cull the input star catalogs for stars with S/N > 100, and 
by using the SAME_CRVAL astrometry assumption that the 
WIRCam focal plane geometry is stable. 

The next steps are flat fielding and median sky subtraction. 
For flat fielding, we apply flat field frames built from sky im- 
ages (sky flats), as opposed to the dome flats and twilight sky 
flats employed in the 'I'iwi 1.0 pipeline. As we explore thor- 
oughly in Section 4 and again in Section 10, sky flats are cru- 
cial for tracking flat field evolution in the WIRCam detector. 
Next, our pipeline performs preliminary sky subtraction us- 
ing median sky images, as described in Section 5. Since we 
apply a new flat field recipe, our pipeline also re-estimates 
photometric zeropoints by bootstrapping directly against the 
2MASS Point Source Catalog; this operation is described in 
Section 6. 

Finally, while the data have formally been flat fielded and 
photometrically calibrated, the sky level estimation provided 
by the sky-target nodding observing scheme is not accurate 
enough to build a seamless mosaic. Instead we model sky off- 
sets that enforce internal consistency in the sky levels of target 
frames. In Section 7 we discuss the methods for modelling 
these sky offsets and provide an analysis of their amplitudes 
in Section 8. 

4. SKY FLAT FIELDING 

Dome flats fail to properly calibrate WIRCam data, as evi- 
denced by Figs. 5 and 6. Sky flats are an appropriate alterna- 
tive, both because of the abundant sky background (any NIR 
imaging program can use its own images to build sky flats), 
and because sky flats more aptly trace detector illumination 
and gain structure. The former because skylight traces the 
same optical path as astronomical sources; the latter because 
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WIRCam's gain structure appears variable. Sky flats allow 
detector gain mapping in real-time. We return to the variabil- 
ity of WIRCam flat fields in Section 10. 

4.1. Sky Flat Designs 

Producing sky flats is as simple as median-combining in- 
tegrations of blank sky. The ANDROIDS sky-target nodding 
observing strategy provides an abundance of 'sky' images for 
this purpose. A fundamental decision is the definition of the 
window of sky integrations that are combined into a sky flat. 
Here we present two sky flat designs: labelled QRUN and 
FW100K. 

The first choice assumes that flat fields are stable over a 
queue run (a continuous installation period of WIRCam). In 
this case, all sky integrations taken during a queue run, and 
through a given filter, are combined into a QRUN sky flat. This 
choice is reasonable since dust and optical geometry should 
be stable during a continuous mounting period. Further, 
choosing a large pool of sky images ensures high S/N and 
helps to marginalize over the shape of the sky background. 
For the ANDROIDS program, QRUN skyflats are built from 25- 
637 sky integrations over several nights (see Table 2). 

An alternative design choice assumes that WIRCam's il- 
lumination function and detector gain structure is unstable 
even over short periods. Here the objective is to make many 
sky flats that reflect the real-time flat field function — we call 
these Real-Time Sky Flats. Our first real-time sky flat, la- 
belled FW100K, is designed such that the pool of sky images 
reaches cumulative sky levels of at least 100,000 ADU, or that 
the time span from first to last sky integration be no longer 
than two hours. Properties of these sky flats — the number of 
images required to build them, and the width of the windows 
they cover — are shown in Fig. 7. Given the 07B /-band ST 
nodding pattern, 15 sky integrations are accumulated in 50 
minute windows, whereas the more frequent nodding in the 
09B campaign shortened this window to 20 minutes (though 
as long as 50-90 minutes in dark sky conditions). The brighter 
K s sky calls for just 7-13 integrations in 07B, or 10-20 inte- 
grations in the 09B campaign. This number of K s sky samples 
was accumulated within 10-30 minutes in 07B, or 10-70 min- 
utes in 09B. 

4.2. Building Sky Flats 

Given the ensemble of sky integrations, our next task is to 
scale the intensity of each image. This scaling obeys three re- 
quirements: 1) each image frame in the median stack is at the 
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Figure 7. Characteristics of FW100K real-time sky flats, meeting criteria of 
at least 100,000 sky ADU, and spans of less than two hours. Distributions 
of 07B sky flats plotted as red outlined histograms, 09B sky flats are plotted 
as shaded histograms. The number of sky images required to build a sky flat 
(top), and the time span from first to last sky image depends on sky bright- 
ness and the sky-target nodding pattern (bottom). Real-time sky flats can 
consequently be refreshed in as little as 10 minutes, or span the order of an 
hour. 

same level, 2) each WIRCam detector has a unified zeropoint, 
and 3) the sky flat across the whole array is flux normalized. 4 
This scaling is determined by the median pixel level measured 
on each detector for each sky integration — let us denote these 
median levels as a,j for the ith sky image's level in detector 
j, j = 1,2,3,4. To avoid bias in the sky estimate, we mask 
any pixels that do not sample blank sky. Source Extractor 
(Bertin & Arnouts 1996) is used to define stars and back- 
ground galaxies (we use 'I'iwi 1.0 *p . fits images to detect 
and mask sources), while hand-drawn polygon regions cover 
diffraction spikes and the diffuse halos around bright galactic 
stars. These masks, along with the 'I'iwi 1.0 bad pixel mask, 
are combined with WeightWatcher (Marmo & Bertin 2008). 

From the ensemble of images produced by an indi- 
vidual WIRCam detector, we compute the median sky 
level: /3y = median(ai J ,a2j...a„j). Further, we also com- 
pute S, the median of all median detector levels: S = 
median(/3,). Then each the sky image is scaled by the fac- 
tor fij = median(/3 1 ,/?2,/33,/34)/(o;y5). Note that the fac- 
tor a~j normalizes each image to the same level for stack- 
ing, while the ratio (3j/S adjusts the level of each detec- 
tor according to detector-to-detector zeropoint offsets. In- 
deed, detector-to-detector zeropoint offsets can be measured 
as -2.51og 10 (/3,//3i), as shown in Fig. 8. WIRCam detector #1 
(North-West quadrant) is clearly the most sensitive, while the 
dispersion indicates the level of gain variability in WIRCam. 

4 It is also acceptable to establish chip-to-chip zeropoint offsets using dif- 
ferential 2MASS photometry, rather than from sky surface brightness. In §6. 1 
we establish the equivalence of the two methods. 
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Figure 8. Distribution of real-time sky flat scaling factors, measuring 
detector-to-detector zeropoint differences relative to detector #1 (red: #2, 
green: #3, blue: #4) in / and K s bands. 

We also note systematic differences in spectral response of 
WIRCam detectors in the J and K s of order 5%. 

The flat itself is built by median combination. Median 
combination of a stack of hundreds of 2048 x 2048 pixel im- 
ages, each with a weightmap masking astronomical sources, 
is computationally intensive. A convenient solution is to use 
Swarp (an image-mosaicing software package, Bertin et al. 
2002) in a mode that combines images pixel-to-pixel. 

5. MEDIAN SKY IMAGE SUBTRACTION 

Since M31 is much larger than individual WIRCam fields, 
sky background is subtracted (to first order) using the sky lev- 
els found in contemporary sky images. Section 2 described 
the sky-target nodding sequences chosen for the 2007B and 
2009B observing campaigns. Although a scalar sky level can 
be estimated from a sky image, and subtracted from the paired 
target images, it is common to construct a median sky image, 
the same size as the WIRCam frames, and subtract this 2D 
image from target images. 

Independent median sky images for each WIRCam detec- 
tor are produced by choosing a sky image (the primary sky 
image) and four other sky images taken at adjacent times. 
Across each image, the median sky intensity is recorded. A 
Source Extractor object mask, as used in §4.2 for flat field- 
ing, removes bias from astrophysical sources. Each sky im- 
age is additively scaled to a common intensity level, allow- 
ing differences in overall sky amplitude to be ignored by the 
median combination. As described in §4.2, Swarp is used to 
median-combine the sky images, taking into account non-sky 
pixel masks. Since these median sky images record only low- 
frequency spatial information, these median sky images are 
smoothed with a Gaussian kernel (note this is quite different 
from the function of median sky images applied to dome-flat 
processed WIRCam data, where median sky subtraction also 
removed pixel-to-pixel artifacts). This median sky image is 
then additively scaled back to the original level of the primary 
sky image. 

Each science image is sky subtracted by first identifying 
the median sky frame whose primary image was taken most 
closely in time. That paired median sky frame is subtracted 
from the image. 
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In §10 we follow up on this operational discussion to con- 
sider the shapes of median sky images. In particular, we con- 
nect the role of median sky frames to the stability of different 
recipes of sky flats described in §4. 

6. PHOTOMETRIC CALIBRATION 

Since the sky flats used by this ANDROIDS AVIRCam data 
set differ from the dome flats employed by the 'I'iwi 1.0 
pipeline by 2% in edge-to-edge level (see Fig. 6), new pho- 
tometric zeropoints must be established. The Two Micron All 
Sky Survey (2MASS) Point Source Catalog (PSC) (Skrutskie 
et al. 2006) provides a convenient photometric system to boot- 
strap against. WIRCam pointings (20' x 20') in this survey 
contain typically ~ 500 2MASS stars. Although these are 
not standards, the ensemble of 2MASS stars may be treated 
as such. Since the disk of M31 is crowded, and 2MASS 
has poor resolution (1" per pixel), we consider 2MASS point 
source measurements to be unreliable on the disk. Instead, all 
zeropoints are measured against sky images, and those cali- 
brations are applied to paired disk images (analogous to the 
median sky subtraction procedure, described in Section 5). 

Photometry of 2MASS stars in the uncrowded sky fields is 
obtained with Source Extractor (Bertin & Arnouts 1996). We 
use the AUTO photometry mode to capture the full stellar light 
without using aperture corrections. Objects in the 2MASS 
PSC are matched to Source Extractor detections by position 
using J. Sick's Mo' Astro 5 Python package, that manages the 
full 2MASS PSC in a MongoDB database. The 2MASS PSC 
contains many galaxies, and many 2MASS sources are satu- 
rated in our deeper WIRCam images. Thus we select stars 
with mj < 14 or < 15 magnitudes, and stars with FWHM 
< 1". Additionally, we select sources with J—K s < 0.8 (typ- 
ical of foreground Milky Way stars) as we observe larger ze- 
ropoint residuals in redder stars. After filtering, typically 200 
matched 2MASS sources remain in typical WIRCam images. 

To fit a photometric zeropoint, we bootstrap directly against 
2MASS photometry visible in sky images. This practice obvi- 
ates any airmass dependence, since each fitted zeropoint, mo is 
instantaneous. Given sample of matching 2MASS and instru- 
mental photometry, the instrumental zeropoint is estimated as 
median: 



m Q = (w2MASS+2.51ogj (ADU/r ex p)-A(y-X s ) 2 MASs)- (1) 

The A term permits a linear colour transformation between 
the 2MASS and WIRCam bandpasses. Although we could 
fit a colour transformation coefficient A for each image, the 
short J—K s colour baseline makes such measurements unreli- 
able. Instead we adopt A } = 0.05 and A Ks = -0.05 (K. Thanju- 
var, priv. comm.) based on modelling stellar spectra with 
the 2MASS and CFHT/WIRCam transmission functions. For 
typical M31 RGB stars with J — K s ~ 1, this colour transfor- 
mation is a 0. 1 mag effect. 

Given that 2MASS stars in each image have photometric 
uncertainties 0.05 < CT2mass mag ;$ 03, the typical statistical 
zeropoint uncertainty, a„, , is 0.1 mag in a single image. Since 
mo is fit for sky images, zeropoints for science images are 
interpolated as the median of a sliding window of sky images 
large enough for the statistical uncertainty of the median mo 
to be reduced below 0.01 mag. 

6.1. Detector-to-Detector zeropoint consistency 

5 http : //moastro . jonathansick . ca 
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Figure 9. Distribution of mean detector-to-detetector zeropoint offsets for 
sky images processed by real-time sky flats. Zeropoint offsets between de- 
tectors #1— #2, — #3, and — #4 are plotted as red, green and blue histograms, 
respectively, for the /-band (top) and A'j-band (bottom). 
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Figure 10. Median sky subtracted WIRCam J (left) and K s (right) mosaics 
ofM31. 

Our sky flats are designed to unify the zeropoints of the four 
WIRCam detectors by scaling according to the modal sky lev- 
els seen on each detector (see §4). The accuracy of this cal- 
ibration can be verified by comparing photometric zeropoint 
estimates for individual WIRCam detectors. Figure 9 shows 
the distribution of mean detector-to-detector zeropoints off- 
sets observed in images processed by real-time sky flats. We 
find that zeropoints are consistent within ±0.1 mag, although 
we detect a possible systematic bias between detectors #1 and 
#4 at the level of 0.03 mag. 

7. SKY OFFSET OPTIMIZATION 

Despite our attention thus far to real-time sky flat fielding, 
and median sky image subtraction, we have yet to recover the 
true NIR surface brightness of M31. In Fig. 10, we plot mo- 
saics (assembled using Swarp) from ANDROIDS frames pro- 
cessed with the pipeline discussed in §3— §6. Although these 
basic image preparations dramatically improve image qual- 
ity compared to the 'I'iwi 1.0 pipeline (particularly with re- 
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gards to flat fielding performance), classical NIR sky subtrac- 
tion still fundamentally limits accuracy. Classically, the true 
value of the sky on M31's disk is lost by the temporal and 
spatial variations of skyglow between disk and sky field ob- 
servations. 

We now demonstrate how the residual sky bias in each ob- 
servation can be inferred from information in the overlaps of 
image pairs in the mosaic. Each classically sky subtracted 
image of the M31 disk is a combination of the true surface 
intensity, and a residual sky intensity, e,. Consider a pair 
of images, i and j, that overlaps the galaxy: their difference 
is (/,■ + e,) - (Ij + cj) = £,■ - ej. Given this measurement e,- - ej 
of residual sky intensity, we introduce sky offsets, A, for each 
observation so that 

(/,■ + £,- A,-) -(/; + e,- A,-) ^0, (2) 

and the intrinsic intensities cancel, = 0. Given a single 
pair of images, the inference of A, and Aj is degenerate given 
the single difference image, e, -e ; . In a mosaic, each image 
is coupled with many other images, and the mosaic itself can 
be considered as a network of coupled images. Thus by opti- 
mizing a set of sky offsets for all images simultaneously that 
minimizes all image-to-image differences in a mosaic, a ro- 
bust of set of A,- can be established. 

7.1. Implementations 

The sky offset optimization summarized above is difficult 
to implement because it is over-constrained, requiring a non- 
linear optimization. Any single image shares a network with 
others, and any error in the surface brightness shapes of fields 
will result in an imperfect match. In this work, we consider 
two implementations of sky offset optimization. First, we re- 
view the Montage software package in §7.1.1, and then intro- 
duce an alternative in §7.1.2. 

7.1.1. Montage 

Montage is a FITS mosaicing package (Berriman et al. 
2008) originally written for the 2MASS survey that includes 
sky offset estimation (background rectification, in their termi- 
nology) functionality. Images are rectified onto a mosaic pixel 
grid, and difference images are computed among overlapping 
images. Sky offsets are then chosen iteratively by looping 
through each image pair and choosing the offset needed to 
minimize the difference image of that pair, counting previous 
sky offset estimates. That is, at each step the sky levels of 
the two images are increased and decreased by half the total 
intensity difference. Sky offsets are refined over several loops 
through the entire set of overlapping image pairs until conver- 
gence is reached (that is, once incremental adjustments to sky 
offsets diminish below user-specified threshold). Although 
this iterative implementation of sky offset optimization is ele- 
gant, its accuracy has never been formally analyzed in litera- 
ture, to our knowledge. In particular, we are interested in the 
robustness of Montage sky offsets against local minima in the 
Af-dimensional solution space of sky offsets, given a mosaic 
of N independent images. 

7.1.2. A New Implementation based on the Multi-Start 
Reconverging Downhill Simplex 

As an alternative to the iterative algorithm used by Mon- 
tage, we introduce here an implementation based on the 
Downhill Simplex algorithm (Nelder & Mead 1965, here- 
after, NM). Like Montage, our pipeline begins with a set of 



images resampled to a common pixel grid in an Aitoff pro- 
jection with the native WIRCam pixel scale (0"3 per pixel). 
For this we use Swarp in resample-only mode. We identify 
overlaps between images in a brute-force fashion according to 
their frames in the mosaic pixel space, defined by the CRP IX, 
NAXIS1 and NAXIS2 header values of the resampled im- 
ages. 

For each overlapping image pair, we compute a difference 
image, and ultimately a median difference, (Ij-Ij). While 
computing the median difference, we mask bad pixels using 
weight maps (propagated by Swarp) and expand this mask 
with sigma clipping. Along with a difference estimate, we 
also record the area Ay of unmasked pixels in the overlap, 
and the standard deviation of the difference, cry. 

Let us define the objective function that encapsulates the 
effect of scalar sky offsets on reducing the intensity difference 
between coupled images: 

T(A u ...,An) = J2 W u( & A <' + A >) 2 . ( 3 ) 

•J 

which we intend to minimize by finding the optimal set of 
scalar sky offsets A,- for each of the detector fields i. Note that 
each coupled image pair is its own term in the objective sum- 
mation, and that there are as many degrees of freedom (A,) as 
there are images in the mosaic. Each coupling is tempered by 
a weighting term Wy: 

An 

Wij = —, (4) 

so that more priority is given to couplings of larger areas (Ay), 
and small standard deviations of their difference images (cry). 

Note that the objective function in Eq. 3 puts no constraint 
on the net sky offset: A i- Assuming that sky subtraction 
errors are normally distributed, and not biased, sky subtrac- 
tion offsets should not add a net amount of flux to the mosaic. 
Fortunately, it is possible to impose this constraint post facto 
by subtracting the mean offset from the sky offsets: 

n 

A?=A i -«- 1 ^A i . (5) 

In the limit that sky offsets A, are drawn from a Gaussian dis- 
tribution, with standard deviation a a, the absolute brightness 
of the whole mosaic will be uncertain by cta / y / Mmages • The 
consequences of this uncertainty are revisited in §9. 

Given the image coupling records, we optimize the set of 
A, using the NM downhill simplex. This NM algorithm is 
naturally multi-dimensional and does not require knowledge 
of the gradient of the objective function. Instead, the NM al- 
gorithm operates by constructing a geometric simplex of N+ 1 
dimensions that samples the sky offset parameter space. By 
evaluating the objective function at each vertex of the simplex, 
the NM algorithm adapts the simplex shape to ultimately con- 
tract upon a minimum. 

However, NM has two weaknesses. First, it is a greedy op- 
timizer that will converge into any local minimum, without 
necessarily seeking the global minimum. Second, for high- 
dimension optimizations (many fields in the mosaic), the NM 
may fail to converge in a reasonable number of objective func- 
tion evaluations (Neumann & Han 2006). Without solving 
these issues, a minimization of Eq. 3 with an off-the-shelf 
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NM code yields a mosaic with obvious discontinuities across 
fields. To solve the first problem, we develop a Multi-Start 
Reconverging NM (MSRNM) downhill simplex driver. This 
algorithm, outlined below, allows the NM to cumulatively 
cover a larger portion of parameter space to probabilistically 
ensure convergence into a global optimum. 

Multi-start — To cover a large portion of parameter space, and 
protect against starting near a false minimum, we start several 
independent simplex runs from random points in parameter 
space. We find that N s = 50, and possibly fewer, starts are 
quite sufficient for an optimization with 39 sky offset param- 
eters (such as the fitting of Ag block offsets in mosaic, see 
§7.2). For each start, an initial simplex is generated randomly. 
Since each point in the by + 1 simplex is a suggested sky 
offset for a given field, each offset is randomly sampled from 
the expected distribution of image-to-image sky offsets. That 
is, a normal distribution of mean zero intensity, and standard 
deviation of <7 sta rt- To ensure that the parameter space is well 
covered, we design cr staIt to be 3 x greater than the dispersion 
of image-to-image differences. 

Each initial simplex is allowed to converge according to the 
NM algorithm. A simplex is deemed to have converged once 
each point has changed by no more than 10~ 6 of the previous 
iteration. 

Restart — As suggested in Press (2007), we restart the con- 
verged simplex to ensure reconvergence and protect against 
false minima. Upon each convergence, the optimal point in 
the simplex, p, is recorded. A new simplex is then gener- 
ated where one vertex is p, and the rest are p+S where S is a 
normal random variable of mean zero, and standard deviation 
frestart- That is, the simplex of the restart retains one vertex 
upon the previously found minimum, while the other vertices 
surround that minimum. If the minimum is indeed the global 
minimum, then the NM algorithm will quickly reconverge. 
More often than not, the other random vertices will probe an 
even better parameter space, causing the NM restart to con- 
verge upon a new minimum. When a new minimum is found, 
the process of restarting is repeated until the same minimum is 
consecutively arrived upon. Our sky offset optimizations for 
39 blocks typically require ~ 1000 restarts before converg- 
ing definitively. Given the large number of restarts, we find 
that this reconvergence feature is extremely effective at elud- 
ing false minima in our optimizations. We set a lestalt to 2 x the 
dispersion of image-to-image differences. Like <7 s tart> there is 
flexibility in choosing a restart ; generally <j start should be on the 
order of the observed image-to-image difference dispersion, 
not necessarily as large as <7 sta rt. 

Note that each simplex start and series of subsequent 
restarts can be performed in parallel. Once all simplex runs 
are complete, the set of sky offsets belonging to the run that 
yielded the smallest value of the objective function is adopted. 

7.2. Hierarchical sky offset optimization 

Sky offset optimization is challenging because of dimen- 
sionality. Each image frame, and the associated scalar sky 
offset A,-, represents an additional dimension in the opti- 
mization. Considering that the WIRCam array produces four 
image frames for every exposure, the combined 2007B and 
2009B data sets consist of 3924 J and 4972 K s image frames 
covering the disk of M31. Such a large optimization is com- 
putationally ambitious, but also unnecessary. Our sky opti- 
mization algorithm breaks the optimization of sky offsets into 
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Figure 11. Scalar-sky fitted WIRCam J (left) and K s (right) mosaics of 
M31. Note the qualitative improvement compared to the original, median 
sky-subtracted, images in Fig. 10. 

three sequential steps, which we call hierarchical sky offset 
optimization. 

The 2007B and 2009B WIRCam surveys include a total of 
39 fields across the M31 disk (illustrated in Fig. 1). Each 
WIRCam field is imaged with four detectors, arranged in a 
2x2 grid. Let us define a detector field as the collection of 
images taken with a given detector, at a given field. Images 
in a detector field all have the greatly simplifying property of 
overlapping across a common section of the image frame. 

Thus our M3 1 WIRCam mosaics are assembled by apply- 
ing sky offsets in three levels of hierarchy. In the first level 
we combine the frames in a detector field to produce a de- 
tector field stack; these offsets are labelled as A F . The com- 
bined 2007B and 2009B surveys have 156 such stacks per 
filter. In the second level, the four detector field stacks within 
each field can be fitted into a block using offsets labelled as 
As. A block is a fundamental unit of the mosaic, as all im- 
ages that are combined within a block were observed under 
contemporaneous sky conditions. Finally, in the third level, 
the 39 blocks can be fitted into a galaxy-wide mosaic for each 
filter using offsets labelled as Ag- The net scalar sky offset 
applied to each frame is thus As = Af + A$ + A^. In both the 
second and third levels, we use the MSRNM scheme to opti- 
mize the set of A s and A B offsets, respectively. Note that for 
detector field stacks it is sufficient to simply compute a mean 
surface brightness across all frames, and directly compute off- 
sets (Af) between between the levels of each frame and the 
mean level. 

8. ANALYSIS OF SCALAR SKY OFFSETS 

The fruits of our WIRCam pipeline and sky offset optimiza- 
tion are presented in Fig. 11. Compared to our mosaics with- 
out sky offsets, Fig. 10, the sky offset optimization is clearly 
essential for assembling wide-field NIR mosaics. Note that 
these mosaics are not yet perfect; field-to-field discontinuities 
at a level of 0.05% of sky remain, and large-scale sky residu- 
als perturb the outer M3 1 disk. 

8.1. Amplitudes of Sky Offsets 

The distribution of scalar sky offsets provides an excel- 
lent characterization of sky subtraction uncertainties when us- 
ing sky-target nodding. Recall that sky offsets are optimized 
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Table 3 

Hierarchy of scalar sky offsets (using FW1 OK RT flat fielding, and median 
sky subtraction). The 'Total' sky offsets track the net offset of individual 

WIRCam image frames into the fitted mosaic. (/ s iy } is taken as the 
instantaneous sky level for the images being sampled (see Fig. 4 for the 
distribution of levels). Offset distributions are also presented in units of the 
WIRCam mosaics, DN, corresponding to a zeropoint of 25 mag. 









K s 


Offset Type 


Sem. 






A F 


07B 
09B 


2.53 
1.91 


2.29 
1.88 


A s 


07B 
09B 


0.11 
0.11 


0.05 
0.06 


A B 


07B 


1.24 


0.94 


09B 


0.66 


1.13 


A E 


07B 
09B 


2.73 
1.98 


2.44 
1.96 



hierarchically: WIRCam frames are fitted to stacks, stacks 
are fitted into blocks of four contemporaneously-observed 
WIRCam detector fields, and these blocks are fitted into a 
mosaic. Table 3 lists the standard deviations of these offset 
distributions with respect to the typical sky level observed in 
the J and K s bands. 

Note that the sky offsets, as a percentage of sky level, are 
comparable in the J and K s bands, despite the skyglow being 
~ 4x brighter in K s than J (see Fig. 4). This indicates that 
spatio-temporal variations in the NIR sky are monochromatic. 

Within the hierarchy of sky fitting, simply fitting frames to 
a stack (with Af) is most significant: a correction on the order 
of 2% of the sky intensity. Fitting blocks into a mosaic (A#) 
is a further ~ 1 % correction. Overall, the temporal and spatial 
lags of sky-target nodding induce a 2% uncertainty in the sky 
level at the target. It is this level of uncertainty that sky off- 
set optimization must diminish to transform uncorrected mo- 
saics (Fig. 10) into ones that reproduce the disk with fidelity 
(Fig. 11). 

Note that offsets to fit a stack into a block (A s ) of four de- 
tector field stacks are smallest: 0.1% of the sky level. This 
suggests that on the scale of the 2x2 WIRCam array, the 
contemporaneously observed detector frames are subjected 
to nearly identical biases in sky background. Stack offsets, 
then, arise from uncertainties in the pipeline's measurement 
of the sky level from single frames in two stages: estimating 
detector-to-detector zeropoint offsets from frame sky levels 
(§4.2) and again when subtracting a median sky frame (§5). 
Indeed, in §10 we show that median sky images have shape 
amplitudes of 0.3% of the sky level and that individiual frames 
have surface brightness shapes that are uncertainty at a level 
of 0.2%; A s sky offsets are thus a consequence of the limited 
surface brightness flattness across a WIRCam frame. 

A comparison between the net sky offsets applied to the 
2007B and 2009B data sets is provocative. Although the 
2009B dataset employed rapid sky-target nodding to mini- 
mize temporal lags between disk and sky sampling, the mag- 
nitude of sky offsets in the 2007B and 2009B semesters is 
comparable. This implies a limit to the absolute sky level ac- 
curacy that can be expected: the minimal 40 sec lag between 
sky and target samples, combined with a 1-2° nod across the 
sky, allows the sky level to change by 2%. Reducing this la- 
tency, and this nodding distance, is impossible in WIRCam 
observations of M3 1 . Thus, by the metric of Table 3, the ex- 
pensive 2009B observing approach did not pay off. 
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Figure 12. Acceptability of J and K s scalar sky offsets between blocks, as 
measured by the ratio of Ag/cr^ F , plotted as histograms and field maps. 
Shaded blue and red-outlined histograms distinguish blocks observed in 
2007B and 2009B, respectively. Scalar sky offsets required for blocks are 
consistent with the sky level uncertainties of single frames, given sky-target 
nodding sky subtraction. 

8.2. Acceptability of Sky Offsets 

Recall that scalar sky offsets were initially introduced as 
intensity increments to overcome uncertainty in the sky level 
of detector field stacks. For sky offsets to be considered ac- 
ceptable, we demand that the offsets applied to blocks, be 
consistent with the sky level uncertainty of the blocks them- 
selves. We can conservatively measure the sky uncertainty 
as the dispersion of Af frame offsets in a stack: cr&. r . If sky 
offsets fitted between blocks are statistically permissible, then 
Ab < <TAp ■ In Fig- 12, we plot field maps (in the same spatial 
configuration as Fig. 1) painted with the values of Ab/<ja f 
for each block in the J and K s mosaics. The sky offsets are in- 
deed distributed within the uncertainty budgeted by cta f : the 
sky offsets are statistically acceptable. 

Another demonstration of the veracity of these sky offsets 
is given in Fig. 13, where we plot a time series of both directly 
measured sky levels, and sky levels interpolated on disk ob- 
servations via sky offsets. Note the remarkable continuities of 
the sky level from sky-to-disk observation; compared to the 
sky level time series of stationary sky fields, the variations in 
the sky level interpolated on the disk appear real. Through 
the sky target nodding and sky offset optimization, we have 
effectively measured the sky level on M31. The variability of 
the sky in these time series will be further examined in §8.4. 

8.3. Residual Image Level Differences 

Although scalar sky offsets are statistically valid, they are 
not perfect prescriptions against the sky subtraction uncer- 
tainties of each image stack — that much is visually true. A 
measure of the limitation of sky offset fitting are the residual 
image level differences between coupled blocks, (7,— Ag,) - 
(Ij-Abj), after the sky offsets Abj have been optimally fitted 
to each block. 

Table 4 lists distributions of both image level differences 
between coupled blocks, before and after the application of 
scalar sky offsets. Uncorrected, the ensemble of coupled 
blocks have a mean intensity difference of ~ 1 % of the typi- 
cal sky background intensity. Scalar sky offsets decrease the 



12 



Sick et al. 



^ 30 » m 

< pu *. " .v 

> [54345] * * % « * ••••»:•••*.•.. 

10 ■ ■ ■ ....... 



o 
I 

10 



< 
+ 



o 
10 

I 

in 



50 
40 
30 
20 
10 


-10 
50 

40 

30 

20 

10 



-10 



55132 



54318 



• • • 

''•^ • • •••• 



srrr: 

••••••• •••» •••• 



• ••• 



1 55051] •••••••••• 



55136 



••••• M 



[54396] • * 



54366 



••••••• 



•••• 



*■•• ■■ - 

••••••*..•<*•.•.. jv«KwVrwi«>*«* ••••• 

ft .Tfr..^ „ • 

••••••••••••• 



10 



20 



30 

AT (minutes) 



40 



50 



60 



Figure 13. Sky level time series for selected nights in the J (top) and K s bands (bottom) as a percent difference of each night's initial sky level (with a 10% offset 
between nights). Individual nights are tagged by a modified Julian date (MJD). Directly measured sky levels are plotted in black, while sky levels estimated on 
the disk (with scalar sky offsets) are coloured dots. The continuity between real and estimated sky levels demonstrates the validity of sky offsets. 



differences between overlapping fields to ~ 0.2%. 

Fig. 14 shows the block-to-block residual differences as a 
fraction of the local surface brightness. Note that through- 
out the bright inner disk of M31, block-to-block residuals are 
negligible compared to the disk signal, at the mosaic periph- 
ery (R ~ 20 kpc), field-to-field residuals become comparable 
to, or greater than, the disk surface brightness. The poor fit is 
driven primarily by diminishing disk signal, rather than poor 
convergence of sky offsets. This can be seen by plotting the 
magnitude of block-to-block residuals (in units of sky bright- 
ness) in Fig. 15. There, significant residuals are distributed 
throughout the disk, rather than the low-SB periphery of the 
mosaic. 

Of course, it is precisely this residual image level differ- 
ence whose minimization was sought as part of the optimiza- 
tion's objective function (Eq. 3). The inability of scalar sky 
offset optimization to eliminate residual image differences 
should not be interpreted as a failure to detect the global min- 
imum; the MSRNM optimization algorithm appears robust 



in yielding this offset solution set. Evidence of this can be 
seen in Fig. 16, where block-to-block network connections are 
coloured by the ratio of the residual block-to-block intensity 
difference to the uncertainty in the block-to-block difference 
image. The sky offsets solved by the MSRNM algorithm are 
within the uncertainties of the difference images themselves; 
better scalar sky offsets cannot be made with the WIRCam 
blocks that our pipeline has produced. Nonetheless, block- 
to-block surface brightness discontinuities are still evident in 
Fig. 11. 

An interpretation of this predicament is that sky background 
residuals are not entirely scalar across WIRCam blocks. Our 
discussion in §10 indicates that flat field variability, and sky 
background variability, can affect the shape of WIRCam 
fields. 

8.4. The Growth of Sky Offsets in Time and Space 

How does our sky-target nodding program affect the mag- 
nitude of the sky offsets? This was partially addressed in 
§8.1, where both the 2007B and 2009B observing programs 
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Table 4 

Coupled block intensity differences and residual intensity differences after 

application of scalar sky offsets: 25th, 50th and 75th percentiles of 
distribution. Differences are presented as a percent of the mean sky level 
seen by observations in each band. 

Coupled Block </, -/;}/(/ sky > (%) 





25th 


50th 


75th 


/, uncorrected 


0.47 


0.91 


1.70 


J, scalar offset 


0.05 


0.09 


0.18 


K s , uncorrected 


0.42 


0.90 


1.43 


K s , scalar offset 


0.02 


0.05 


0.08 
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Figure 14. Map of residual block-to-block surface brightness differences as a 
fraction of mean local surface brightness, after scalar fitting. This graph mim- 
ics the spatial distribution of the 2007B and 2009B WIRCam fields (Fig. 1), 
with the footprints have been exploded to allow room for lines to connect 
coupled blocks. 
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Figure 15. Map of residual block-to-block surface brightness differences as 
a fraction of the mean sky level, after scalar fitting. 

resulted in similar net offset distributions (Table 3). In this 
section, we directly map the observed sky offsets as a func- 
tion of time latency relative to the sky observation. 

First, we must realize that sky offsets are born not only from 
temporal variability in the sky level (e.g., Fig. 13), but also 
from spatial structure in the NIR skyglow. Thus as a fiducial, 
we first build a function of mean sky level variation as a func- 
tion of time measured at stationary site on the sky (a single 
sky field), without telescope nodding. In Fig. 17 we plot the 




iogiof^residual/Oi/) logioC^ residual/^y) 

Figure 16. Map of residual block-to-block surface brightness differences as 
a fraction of the standard deviation of the difference image, after scalar fitting. 

mean and 95% growth of sky level variations as a function of 
time. In agreement with Vaduvescu & McCall (2004), we see 
a mean sky level variation of ~ 0.5% in 1 minute in both J 
and K s bands. After 5 minutes, the intrinsic sky level varia- 
tion typically grows to 2%. At worst, we see a sky variation 
(measured at the 95% level of the sample distribution) of 5% 
in 5 minutes. 

Individual points in Fig. 17 are net sky offsets of disk im- 
ages plotted against the time latency to the paired sky sample. 
The periodic time structure in Fig. 17 is a consequence of the 
2007B and 2009B sky-target nodding schemes (see Table 1); 
indeed the circles and 'x' marks in Fig. 17 denote the mean 
and 95% level of sky variation, respectively, in each cluster. 
Recall that all 2009B disk integrations have equal sky sample 
latency due to the sky-target-target-sky nodding pattern. 

In both J and K s bands, we see that nodding the telescope 
between sky and target generates additional sky level uncer- 
tainty beyond that expected from strictly temporal sky evolu- 
tion. This makes sense in the context of spatial sky variations 
(Adams & Skrutskie 1996). As shown in Fig. 17, the process 
of sky-target nodding can inflate sky variations by 1 .5-2 times 
the sky variability expected at a stationary site on the sky. On 
longer time scales, the nodding and stationary sky variance 
converge, perhaps indicative of the timescales that NIR sky- 
glow structures move across a nodding distance (l°-2° on the 
sky). 

This analysis underscores the challenge of accurately re- 
covering surface brightness in a wide-field NIR mosaic. Sky- 
target nodding with CFHT implies typical time latencies of 
60-70 seconds, and nodding distances of l°-2°. Both of these 
elements prevent the true level of the sky on M31's disk, in 
any single frame, from being known to an accuracy greater 
than 2%. 

Fig. 17 also shows that the 2009B observing strategy of 
minimizing sky nodding latency would maximize sky level 
certainty. The rather shallow slopes of the mean sky variance 
seen in sky-target nodding demonstrates the modest gain sky 
certainty by capping sky latency at 1.2 minutes (i.e., 2009B) 
compared to allowing latencies of 5 minutes (i.e., 2007B). 
This also better explains Table 3. 

9. SYSTEMATIC UNCERTAINTIES IN SURFACE BRIGHTNESS 
RECONSTRUCTION 

Sky offsets produce a mosaic that is rigorously optimal only 
in the sense of field-to-field surface brightness continuity — 
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Figure 17. Comparison of sky variation distribution functions measured as 
sky offset amplitudes and between pairs of sky images for J (top) and K s 
(bottom) bands. Mean and 95% levels of sky variation are plotted as solid and 
dashed black lines, respectively. Mean and 95% levels of sky offsets in the 
2007B semester are plotted as large blue circles and X symbols, respectively, 
while the same for the 2009B semester is plotted as red symbols where sky 
latency was constrained to 1.2 minutes in both bands. 




0.5 0.0 -0.5 

E, (degrees) 



0.5 0.0 -0.5 

E, (degrees) 



Figure 18. Surface brightness difference maps between our simplex scalar- 
fit mosaics (Fig. 11) and Montage scalar-fit mosaics. 

not absolute sky subtraction. In this section, we attempt to 
gauge the systematic surface brightness error inherent in the 
sky offset technique. 

9.1. Comparison to Montage-fitted images 

One means of testing for systematic errors in mosaic con- 
struction is to compare results from different methods. Be- 
sides the simplex method developed in §7. 1.2— §7.2 and ana- 
lyzed in detail in Section 8, we also tested the Montage code 
that uses an iterative algorithm to solve either scalar or planar 
sky offsets (see §7.1.1). Figure 18 shows the surface bright- 
ness difference between our simplex solution and the iterative 
Montage mosaic solution assuming scalar sky offsets. Despite 
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Figure 19. Montage-generated J and K s maps. Compare to the equivalent 
simplex maps, Fig. 11. Pixels with negative surface intensity, after sky offset 
correction, are masked with white. 

an identical dataset, the two methods yield systematically dif- 
ferences of up to ~ 0.5 mag arcsec" 2 at 20 kpc, though the 
solutions are consistent in their treatment of the inner disk. 
Although the simplex and Montage scalar-offset mosaics ap- 
pear equivalently valid to the eye, a unique and optimal sky 
offset solution either does not exist, or is extremely difficult 
for our optimization algorithms to find. 

Montage is also capable of fitting planar sky offsets to im- 
ages, which is a tempting solution to the field-to-field discon- 
tinuities that persist between scalar-offset blocks. The result 
of planar fitting is shown in Fig. 19. We see that planar sky 
offsets, in this case, do little to improve the mosaics, and in- 
deed, have a dramatic effect on the systematic surface bright- 
ness of the mosaic (by more than 1 mag arcsec" 2 in the K s 
band). Planar sky offsets may be amenable for observations 
acquired in long strips (such as 2MASS or SDSS), however, 
the small and square WIRCam fields do not provide the nec- 
essary leverage to prevent systematic error propagation from 
planar offsets. We thus recommend against using planar, or 
higher-order, sky offsets in wide-field WIRCam mosaics. 

9.2. Comparison to Spitzer/IRAC Images 

We also explore systematic uncertainties in our WIRCam 
mosaics with comparisons against well-calibrated images of 
M31. A template for the NIR disk is the 3.6 pm Spitzer/IRAC 
map, presented in Barmby et al. (2006). Note that although 
Spitzer data avoid background subtraction issues caused by 
the NIR sky, planar sky offsets were used by Barmby et al., 
though presumably of a smaller magnitude than our WIRCam 
sky offsets. In Fig. 20 we compare our simplex scalar- fitted 
mosaics against the 3.6 fim image. Generally the J- [3.6] and 
K s - [3.6] colors decrease with disk radius, but increase in the 
star-forming regions due to hot dust emission. However both 
colour maps (coincidentally) become redder in the south west- 
ern disk beyond the 10 kpc star forming ring. We interpret this 
as a systematic over-subtraction of sky in these regions on the 
order of > 1 mag arcsec" 2 . Evidently, our scalar sky offset 
mosaics are not systematically reliable beyond the bright disk 
of M31, toward R > 15 kpc. 

9.3. Monte Carlo Analysis of Systematic Surface Brightness 
Uncertainties 
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Figure 20. Maps of J -[3.6] and ^,-[3.6] surface colour inferred from 
the simplex scalar-sky fitted WIRCam mosaics and Spitzer/IRAC 3.6 
image (Barmby et al. 2006). Note that the IRAC map crops the AN- 
DROIDS/WIRCam footprint. 
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Figure 21. Mosaic maps of bootstrap RMS surface brightness in / (left) and 
K s (right). White contours identify RMS levels of 0.05 (solid), 0.1 (dashed) 
and 0.2 (dash-dot) mag arcsec -2 . 

The difference images presented in the previous section il- 
lustrate how the surface brightness reconstructions of identi- 
cal data can vary depending on the optimization algorithm. 
Here we pose a slightly different question: how are recon- 
structions affected by the initial conditions of sky errors? That 
is, given the possible sets of sky level biases affecting the 
blocks, what is the distribution of surface brightness recon- 
structions? We answer this with a realistic Monte Carlo anal- 
ysis. 

A Monte Carlo (MC) realization is generated by perturbing 
the surface brightness of the corrected blocks with a sky er- 
ror drawn (with replacement) from the ensemble of block sky 
offsets observed in the original mosaic (Fig. 11). Using the 
scalar-sky fitting procedure, sky offsets are optimized against 
the known sky perturbations; 100 such realizations are made 
to compile an ensemble of mosaics in both bands. 

Figure 21 shows the RMS deviation of MC mosaic sur- 
face brightness against the original scalar-fitted mosaics. Re- 
constructed surface brightness in the outer disk can vary by 
~ 1 mag arcsec -2 , consistent with colour biases in the /— [3. 6] 



and K s — [3.6] maps. 

We can ultimately understand the source of surface bright- 
ness by examining the standard deviations in the residual be- 
tween expected and realized sky offsets in each Monte Carlo 
iteration. This residual dispersion is 0.15% of the /-sky 
(0.17% of the K s sky); we find this dispersion to be constant 
across all fields in the mosaics. If mosaic surface brightness 
uncertainty is caused by flexure in the mosaic — where blocks 
on the mosaic periphery are forced to conform to the surface 
brightness of more central and tightly coupled blocks — then 
outer blocks would have higher offset dispersion. This is not 
the case. 

Rather than mosaic flexure, a better model for Fig. 21 in- 
volves uncertainties in the post priori adjustment for zero 
net offset (Eq. 5). Since block sky offsets have approxi- 
mately Gaussian distributions with dispersions given in Ta- 
ble 3, the uncertainty in the net offset correction is simply 
er(block) / -^/nbiocksi where «biocks = 39 in the combined 2007B 
and 2009B mosaic. Given that <7a b ~ 1 %, the expected uncer- 
tainty in the net offset correction is 0.16%: in perfect corre- 
spondence to the observed mosaic uncertainty. The dominant 
source of uncertainty shown in the MC simulations, Fig. 21, 
is the use of an arithmetic mean of offsets to set an absolute 
zeropoint, not flexure or uncertainty in the network of offsets. 
This suggests that external zeropoints could be very useful 
in replacing Eq. 5. Since no absolutely-calibrated NIR pho- 
tometry of M3 1 's surface brightness exists, we will discuss a 
method using panchromatic resolved stellar populations in a 
future work. 

10. ACCURACY OF SURFACE BRIGHTNESS SHAPES ACROSS 
WIRCAM FRAMES 

Our analysis of the WIRCam M3 1 mosaics has thus far fo- 
cused on macroscopic surface brightness accuracy. Here we 
examine the accuracy of surface brightness shapes in individ- 
ual WIRCam frames (and ultimately, blocks). Such shape 
biases give rise to discontinuities between blocks in our op- 
timized mosaics (Fig. 11), and reflect high-order irregulari- 
ties in block surface brightness shape that cannot be corrected 
with either scalar (zeroth-order) or even planar (first-order) 
sky offsets. 

Block shape accuracy is affected by two stages of our re- 
duction pipeline: first, in the proportional corrections of flat- 
fielding, and second, in the additive corrections of median sky 
subtraction. The accuracy and effectiveness of both calibra- 
tions is limited by spatial and temporal variations in the NIR 
sky itself (see discussion in §1). In this section, we attempt 
to deconvolve the scales of proportional and additive surface 
brightness biases, and ultimately establish an empirical upper 
limit on the edge-to-edge surface brightness accuracy seen in 
our WIRCam program. 

10.1. Evolution of Real Time Sky Flats 

In producing a NIR sky flat, we assert that the mean shape 
of the NIR sky over a timespan is flat. By maximizing the 
time window we can marginalize over as many sky shapes as 
possible to produce an unbiased skyflat. This notion is em- 
bodied in the QRUN sky flats that combine hundreds of sky 
shapes captured across several days. However, this also as- 
sumes that WIRCam is a stable detector over the period of 
several days; our real-time FW100K sky flats, however, as- 
sume that WIRCam is unstable over periods of 30-minutes. 

A simple test of WIRCam's flatfield stability is to moni- 
tor how the real-time FW10 0K skyflats evolve on scales of 
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hours. In Fig. 22 we show percent difference images between 
FW1 OK sky flats made at intervals of 15, 30, 60 and 90 min- 
utes after an initial sky flat. After just 30 minutes, the shape 
of the FW100K skyflats deviates by 0.5% from the initial flat 
field shape. By 60 minutes, the deviation exceeds 1%. Evi- 
dently, the WIRCam flat field function is stable on timescales 
less than 30 minutes — much less than a queue run. 

Nonetheless, the spatio-temporal evolution takes many 
forms. In some cases (Fig. 22b) the flat field deviations are 
axisymmetric, while in others there is a distinct East- West 
pattern (Fig. 22ac). We interpret these as instabilities in the 
WIRCam illumination function on the scale of minutes. 

An alternative interpretation is that these sky flat devia- 
tions are instabilities in the WIRCam detector electronics. 
The dominant macroscopic electronic feature in WIRCam flat 
fields are the amplifier bands. Each WIRCam detector is di- 
vided into 32 horizontal bands (each 64 pixels high) that are 
read out into independent amplifiers. These amplifiers have 
gains that result in levels that differ by 10% in flat field im- 
ages. Still, these different gains appear stable: over the course 
of an observing block, the mean flat field level of each am- 
plifier band evolved by less than 0.1% relative to other am- 
plifiers (see Fig. 23) over three hours. Indeed, the amplifier 
band signature is absent from the FW100K skyflat difference 
images (Fig. 22). 6 Thus sky flat evolution does appear driven 
by changes in the large scale detector illumination function, 
not electronic instabilities. 

10.2. Shapes of Median Sky Frames 

Another test of sky flats is their ability to produce an un- 
biased sky background, up to the level of intrinsic sky vari- 
ations. This test can be made by examining the median sky 
frames (§5) produced by QRUN and FW10 0K sky flats, as 
shown in Fig. 24. QRUN sky flats clearly produce biased sky 
shapes, evidently caused by changes in the CFHT/WIRCam 
optical path and electronics over a queue run, rather than in- 
trinsic variations in the sky itself. 

To quantitatively compare the bias of QRUN versus 
FW1 OK sky flats, we measure the amplitude of shapes across 
the 10' x 10' WIRCam frame as the 2-standard deviation in- 
terval (95%) of each median sky image's pixel distribution: 
2er(med sky). These distributions of sky shape amplitude are 
plotted in Fig. 25. As expected from Fig. 24, QRUN median 
sky frames have typical biases of 1% of the sky level, while 
sky frames from real-time FW1 OK sky flats are much flatter, 
< 0.3% of the sky level, and more consistently so. 

Although FWlOOk sky flats are better than QRUN flats, we 
can still wonder if the 0.3% flatness limit of median sky im- 
ages is a consequence of intrinsic sky shapes or due to limits 
in the WIRCam flat field accuracy. Recall that median sky im- 
ages are composed of five sky frames taken closest to a disk 
frame. In 2007B, all sky frames were sampled from the same 
coordinate on the sky, and span a 12 minute window cover- 
ing sky integrations taken before and after a disk image (for 
both J and K s sky-target nods). In 2009B, sky frames were 
sampled from randomly chosen sites along the sky field ring 
(Fig. 1) with a window typically spanning 15 minutes. Thus 
both 2007B and 2009B median sky images span similar time 
windows, although the 2009B strategy attempts to marginal- 
ize over five distinct sites on the sky (and thus sky shapes) 
while 2007B median sky images do not. If the spatial sky- 
glow pattern and WIRCam flatness function were stationary, 

6 Quite unlike Fig. 6 that compared dome flat and QRUN sky flat shapes. 



we expect the instantaneous shape of the sky (in the sky field) 
to be captured in the 2007B median sky frames, while the 
2009B sky frames should marginalize over five random sky 
shapes and thus be flatter by up to a factor of 1 /y5 (for a 
Gaussian shape distribution). That this does not occur indi- 
cates either that (a) sky shapes are correlated over 15 minutes 
and 3° of sky, (b) skyglow patterns have sufficient temporal 
variability to be effectively uncorrelated over 15 minute win- 
dows across a WIRCam frame so that both observing patterns 
marginalize over sky shapes equally, or (c) the flatness of me- 
dian sky images is limited at the 0.3% level by background 
variations associated with WIRCam itself over 15 minutes. 
Wide-field movies of the NIR sky (Adams & Skrutskie 1996) 
suggest option (a) to be false. From this test alone, then, 
we cannot distinguish between instrumental background vari- 
ability or stochasticity in the sky as the cause of the 0.3% 
sky shape amplitudes seen by WIRCam median sky images. 
Further, this test cannot distinguish between proportional in- 
strumental variability in the flat field {e.g., Fig. 22) or addi- 
tive background variability (as seen in the CFHT-IR camera, 
Vaduvescu & McCall 2004). 

10.3. Frame residuals shapes 

In previous sections, we have shown that real-time FW1 OK 
sky flats are preferred over broader-baseline QRUN sky flats 
since they produce systematically flatter median sky images, 
and do seem to track real evolution in the WIRCam flat field 
function on scales of 30 minutes. Yet it is difficult to measure 
the absolute accuracy of real-time sky flats since flat field bias 
cannot be distinguished from additive background stochastic- 
ity (from sky or instrument) when simply observing the shape 
of the sky background. If we examine signal- (not sky-) dom- 
inated fields, flat fielding biases should grow in comparison to 
sky shape biases. 

Our 2009B observations of field M31-37 in the K s band are 
ideal for this experiment: a single detector in that field cov- 
ers the core of M31, and observations were taken into two 
blocks, covering a total window of 2 hours (most blocks for 
this program are observed by the CFHT queue in a half hour). 
Both the high surface brightness and wide time baseline in 
this field should highlight flat field bias and variation. In 
Figures 26 and 27 we show the residual shapes of individ- 
ual WIRCam frames against the median shape of the mosaic, 
given FW100K and QRUN sky flattening, respectively, in field 
M31-37 in the K s band. That is, we produce difference images 
between each WIRCam frame and the block mosaic. To ana- 
lyze the shapes of these difference images we then marginal- 
ize the difference images along rows (left side of Fig. 26), 
and columns (right side of Fig. 26). Note that these marginal- 
ization are done for each detector in the 2x2 WIRCam ar- 
ray; the core of M31 resides in detector #2 (lower-right). In 
that high surface brightness region, there are strong surface 
brightness residuals that clearly point out flaws in the flat field 
itself. FW10 0K sky flats (Fig. 26) WIRCam frames at the 
core of M31 can vary in surface brightness by ±0.5%; with 
QRUN sky flats these residuals are much larger, nearly ±1% 
of the K s -bwA sky brightness. The time scale residual evo- 
lution clearly matches that of flat field evolution indicated in 
Fig. 22. Despite limited accuracy, the FW100K sky flats do 
appear to track the real-time evolution of the WIRCam detec- 
tor and produce more consistent frame shapes. Although all 
frames in Fig. 27 are processed with the same QRUN sky flat, 
the intrinsic flat field function of WIRCam has evolved over 
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Figure 22. Spatio-temporal evolution of FW1 OK sky flats, shown as percent difference of the flat fields 15, 30, 60 and 90 minutes after the initial sky flat of 
the night. Sky flat evolution on three nights is shown: (a) 2009B K s observing block with the telescope staring at a sky field, (b) 2009B K s observing block with 
regular sky-target nodding, and (c) a 2007B observing block with sky-target nodding. 



the span of two hours to create frame-to-frame shape varia- 
tions nearly ±1% of the ^-band sky brightness at the center 
ofM31. 

It is useful to contrast the frame shape residuals seen in de- 
tector #2 with those in other detectors, where the disk surface 
brightness is lower. There, both FW1 OK (Fig. 26) and QRUN 
(Fig. 27) show similar residual distributions, on the order of 
< 0.2% of the NIR sky brightness. Further, the results are not 
monotonically varying in time, as they are in detector #2, and 
indeed appear to vary essentially randomly. We interpret this 
frame residual behaviour as being caused by random additive 
background processes, distinct from flat field biases that are 
proportional to surface brightness. 

10.3.1. Distributions of frame shape residuals 

We now extend the previous analysis analysis across the 
entire dataset. In Fig. 28 we plot the distribution of frame- 
block residual shape amplitudes, measured at the 95% differ- 
ence interval. In essence, this measures the consistency of 
imaging the shape of each M3 1 block. As in our test of me- 
dian sky frame flatness (§10.2), we see that the consistency 
of frame shapes is ~ 0.3% of the sky level. This result is 
seen for both QRUN and FW100K sky flat pipelines and for 
2007B and 2009B observing schemes, agreeing with our ob- 
servation in §10.3 that in background dominated regimes (as 



most of our blocks are) frame shape consistency is not cor- 
related with flat field bias. Rather, we interpret Fig. 28 as 
measuring the amplitudes of additive stochastic background 
shapes originating either from the sky, or associated with the 
instrumentation itself. Effectively, Fig. 28 illustrates \he flat- 
ness limit of WIRCam frames observed with large sky-target 
nods, sky flat fielding, and median sky subtraction. 

10.4. Section Summary 

We summarize our findings on the accuracy of surface 
brightness shapes reproduced by WIRCam in a sky-target 
nodding observing program: 

1. Surface brightness perturbations can be decomposed 
into multiplicative processes (flat field biases) and ad- 
ditive processes (stochastic sky and instrumental back- 
grounds) by observing residual shapes of individual 
WIRCam frames again the median M31 mosaic in high 
and low surface brightness regimes; we find evidence 
for both processes occurring. 

2. The WIRCam flat field function can vary by approxi- 
mately 1% across a 10' x 10' frame in 1 hour. Building 
sky flats concurrently with observations is necessary 
to minimize systematic surface brightness bias. This, 
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Figure 23. Time evolution of the mean flat field level of amplifier bands in 
the four WIRCam detectors in FW1 OK sky flats made over three hours. Am- 
plifier bands are colour-coded according to their order on the array: red lines 
at the bottom, green in the middle, and blue at the top. Although the levels of 
individual amplifiers differ by 10%, their order is consistent, indicating that 
amplifier gain is extremely stable. The jitter is due to detector-to-detector 
zeropoint normalization (§4.2) uncertainty, or measurement biases. 
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Figure 24. Comparison of a typical median sky frames in 2 X 2 WIRCam 
array in the ^ s -band from images calibrated with QRUN (left) and FW100K 
(right) sky flats. Amplitudes of shapes in these median sky images are plotted 
as a fraction of the mean sky brightness. While the real-time sky flat produces 
very flat median sky images, QRUN-calibrated data produce sky biases on the 
order of 1% of the NIR sky level. 



however, is only a concern in signal-dominated pixels 
(such as galaxy centres, or point sources); otherwise 
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Figure 25. Cumulative distribution function of sky level amplitudes across 
median sky images processed with QRUN (black) and FW100K (blue) sky 
flats for the J (top) and K s (bottom) bands. Real-time (FWlOOk) sky flats 
produce much flatter median sky frames, with expected shape amplitudes of 
0.3% (7) to 0.1% (Ks) of the NIR sky level, while the expected amplitude of 
QRUN-flat processed sky images is 0.5% of the sky level, and as high as 3% 
of the sky level. 

median sky subtraction is effective at hiding fiat field 
bias in low-surface brightness features. 

3. In low surface brightness regimes, we observe stochas- 
tic variations of ±0.2% in the sky amplitude. These 
background shapes cannot be removed with median sky 
frame subtraction (which itself has shape amplitudes on 
the order of 0.3% of the NIR sky level). We find this 
to be the limiting surface brightness accuracy across a 
10' x 10' WIRCam frame. 

11. CONCLUSIONS 

We have presented near-infrared (J and K s ) images of 
M31's entire bulge and disk with CFHT/WIRCam. These 
maps surpass the 2MASS (Beaton et al. 2007) and Spitzer 
(Barmby et al. 2006) mosaics with superior resolution that 
permits the identification of individual stars throughout M3 1 's 
mid and outer disk. The dataset is also complementary to the 
HST/WF3 PHAT survey (Dalcanton et al. 2012) by providing 
complete coverage of M3 1 's disk within R = 22 kpc, and by 
offering a broader NIR colour baseline (J-K s ) than is offered 
by WF3 (approximately J-H). NIR mosaics of M31 have 
crucial applications for studies of the nearly attenuation-free 
stellar structure of our nearest spiral neighbour, and for tests 
of stellar population synthesis models in NIR regimes. 

Our focus in this paper has been the establishment of pro- 
cedures for accurately recovering the NIR surface brightness 
across 3 sq. deg. of the M31 disk using a sky-target nodding 
observing strategy with WIRCam on CFHT. We have com- 
pared two different observing methods to study the effects 
of sky target nodding cadences and patterns on sky subtrac- 
tion uncertainties. We have also developed and tested our 
WIRCam pipeline for flat fielding, zeropoint estimation, me- 
dian sky subtraction, and sky offset optimization. 
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Figure 26. Residual shapes of individual FWlOOk sky flat-processed WIRCam integrations of the M31 disk to the median (mosaic) shape for the field M31-37, 
Xj-band, observed in the 2009B semester. Residuals have been marginalized across the x (left) and (y) (right) axes to provide ID views. Axes match WIRCam's 
2x2 detector footprint. Individual integrations are coloured by their time after the first disk integration. The centre of M3 1 is located in the lower-right detector 
(#2); surface brightness bias in these regions betray the presence of flat field bias. Lower surface brightness regions are dominated by shape variations on the 
order of ±0.2% of sky, interpreted as additive uncertainties associated either with the detector, skyglow, or both. 
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Figure 27. Residual shapes of individual of the M31-37 K s field, processed with QRUN sky flats. Compared to Fig. 26, QRUN sky flats clearly do not capture 
flat field evolution that occurs over the course of 90-minutes, yielding systematically evolving realizations of bulge-dominated surface brightness in detector #2 
(lower-right). In the more sky-dominated regions of the image (detector #4), QRUN sky flats produce images with similar stability to FW1 OK sky flats, indicating 
that the limit of additive uncertainties associated with sky or instrumental background variations is reached here. 
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Figure 28. Cumulative distributions of scalar difference amplitudes between 
individual frames and blocks in the J (top) and K s (bottom) mosaics, mea- 
sured as a dispersion of pixel differences at the 95% level. Whether pro- 
cessed with QRUN (red) or FWlOOk (blue) sky fiats, or observed in 2007B 
(solid lines) or 2009B (dotted lines), the residual amplitude differences be- 
tween frames and blocks are similarly distributed. The expected amplitude 
difference is 0.3% of the J sky brightness (0.2% in K s ). 

We find that our NIR SB reconstruction is limited in two 
regimes: large scale reconstruction of surface brightness with 
sky offsets, and surface brightness shape uncertainties across 
individual WIRCam images. On a large scale, the necessity 
of nodding between sky and target limits our direct knowl- 
edge of the sky level on the disk by > 2% of the sky level. 
Strictly minimizing latency between sky and disk integrations 
(as in the 2009B program) provides only limited improve- 
ment in our knowledge of the sky level on the disk because 
of the overhead in nodding the telescope, and spatial structure 
in the NIR skyglow itself. Sky offset optimization is success- 
ful in reducing block-to-block surface brightness differences 
to < 0.1% of the sky level. Given a realized set of blocks, our 
optimization algorithm reliably finds a consistent solution, so 
any errors in surface brightness shape across the mosaic are 
caused by errors in the shapes of individual blocks (see be- 
low). There is, however, an uncertainty in our net zero off- 
set, of order ~ OA B /\/M>iocks; 0.16% of the sky level. This 
zeropoint can ultimately be established using resolved stellar 
populations, the subject of a forthcoming ANDROIDS study. 

The shape of a WIRCam frame can be affected by both 
flat field uncertainties and additive background uncertainties. 
First, we have discovered that the WIRCam flat field func- 
tion can change by 1% in 60 minutes. This effect is predom- 
inantly influential in the signal-dominated regime of the M3 1 
bulge; though it is also significant for resolved stellar pho- 
tometry. We thus find that constructing real-time sky flats is 
essential for calibrating WIRCam images. In sky-dominated 
regimes, the 2D SB shapes of individual WIRCam frames 
(20' x 20') are uncertain by 0.2% of the sky intensity, centre- 
to-edge. These background fluctuations present a lower limit 
in the surface brightness uncertainty across a WIRCam frame 
since they appear to be caused by skyglow variations in target 
images that cannot be fully corrected with median sky sub- 



traction. 

We now summarize our analysis of the data taking and re- 
duction methods developed in this work, and in doing so, for- 
mulate a set of best practices for similar wide-field NIR sur- 
veys employing sky-target nodding. 

11.1. Recommendations for Conducting a Wide-field NIR 
Survey with Sky-Target Nodding 

We first recommend that the sky-target nodding cadence be 
set to effectively build real-time sky flats, rather than simply 
track sky level evolution. Such a program would involve suf- 
ficient sky frames to build a sky flat within a window of 20-30 
minutes, where the sky is observed in several epochs at differ- 
ent locations in a sampling ring to minimize biases. 

In the K s band this objective is efficiently achieved, since 
the mean sky flux on a WIRCam pixel is 450 ± 80 ADU s" 1 . 
Given r exp = 25 s, a [S 3 T 6 ] 3 program yields the necessary 9 
sky integrations within 20 minutes and a maximum sky-target 
latency of 2 minutes. Each sky flat would be built from obser- 
vations at three sites along the sky ring. 

Lower sky flux in the J band (120 ± 30 ADU s" 1 ) requires 
additional sky integration to achieve comparable sky flat S/N 
as the K s band. Given r exp = 45 s, a [S 4 T 4 ] 5 program yields the 
necessary 20 sky integrations in ~ 40 minutes, with a maxi- 
mum sky-target latency of 2.2 minutes. 

Sky offsets optimization is aided by having more indepen- 
dent blocks covering the target, since our net zero offset as- 
sertion is uncertain by (TAb/V^UoAs- Given that a B cannot be 
reduced, increasing the number of independent blocks (ob- 
served hours or even a night apart to decouple sky and in- 
strumental biases) is the most reliable way to establish the 
absolute surface brightness accuracy of the mosaic. Since sky 
offsets are further biased by any shape errors in blocks (re- 
alized as our inability to diminish block-to-block offsets be- 
low ^0.1% of sky brightness), we propose that blocks be 
interlaced by 50% (so that one detector completely overlaps 
a detector from an adjacent blocks). This interlacing pat- 
tern would thus enable the marginalization of shape errors 
across the entire detector frame. By doubling the number of 
blocks, each with individually halved exposure times, the mo- 
saic could be reproduced with an equivalent net integration 
time. 
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